ABSTRACT

Some  simple  procedures  are  provided  for  establishing  the  asymptotic 
normality  and  uniform  strong  convergence  of  a  class  of  functions  that  arise  in 
the  context  of  estimating  parameters  from  a  type  II  censored  sample.  These 
are  used  to  streamline  and  strengthen  the  traditional  treatment  of  the 
asymptotic  theory  of  maximum  likelihood  estimators  based  on  censored  data. 
Further  applications  include  the  treatment  of  asymptotics  of  some  modified 
maximum  likelihood  (MML)  estimators.  In  particular,  conditions  are  provided 
for  the  consistency  and  limiting  normality  of  the  MML  estimators  of  Mehrotra 
and  Nanda,  and  the  asymptotic  efficiencies  of  these  estimators  are  evaluated. 
SIGNIFICANCE AND EXPLANATION


In  reliability  analysis,  inferences  on  the  parameters  of  a  life 
distribution  often  have  to  be  based  on  censored  or  incompletely  observed 
data.  Under  the  censoring  scheme  that  permits  observation  of  a  predetermined 
number  of  failures,  several  modifications  of  the  maximum  likelihood  method 
were  proposed  with  the  goal  of  obtaining  estimators  that  are  relatively  easy 
to apply. Previous work on the large-sample properties of maximum likelihood
and  modified  maximum  likelihood  estimators  have  been  rather  sketchy,  and  the 
methods  too  cumbersome  to  employ  in  multiple-censoring  situations. 

This  paper  develops  a  simple  yet  versatile  approach  that  permits  a 
unified  treatment  of  the  large-sample  properties  of  both  maximum  likelihood 
and  modified  maximum  likelihood  estimators  based  on  censored  data.  It  is  also 
flexible  enough  to  accommodate  multiple  censoring.  In  addition  to  providing 
an improved theoretical treatment, the results help fill a gap of knowledge in
regard  to  the  performance  of  the  modified  maximum  likelihood  estimators, 
especially  their  loss  of  efficiency  in  relation  to  the  severity  of  censoring. 
ON THE ASYMPTOTICS OF MAXIMUM LIKELIHOOD
AND RELATED ESTIMATORS BASED ON TYPE II CENSORED DATA

Gouri K. Bhattacharyya

1. INTRODUCTION

In a life-test setting, suppose the failure times of n units are modeled
as independent random variables $X_1,\ldots,X_n$ having a common continuous
distribution. In general terms, a type II censored sample refers to a specified
subset of the order statistics $Y_1 < \cdots < Y_n$ of $X_1,\ldots,X_n$. The most
common situation is censoring on the right, which permits the first $r$ ($< n$)
order statistics to be observed. However, left as well as multiple censoring are
also often used.

Letting $f(x,\theta)$ and $F(x,\theta)$ denote the probability density function
(pdf) and the distribution function (df) of the failure time, the log-likelihood
of a type II right censored sample is

$$l_n(\theta) \equiv \log[n!/(n-r)!] + \sum_{i=1}^{r} \log f(Y_i,\theta) + (n-r)\log \bar F(Y_r,\theta) \tag{1.1}$$

where $\bar F = 1-F$. Maximum likelihood (ML) estimation under specific parametric
models of the life distribution is widely discussed in the reliability
literature. In regard to the asymptotic theory of ML, the standard theorems do not
apply to (1.1) because its terms are neither independent nor identically
distributed. The work of Halperin (1952) still remains the universal reference
for a rigorous treatment of the asymptotics of ML in this context. However,
Halperin's approach, which rests on asymptotic expansions of certain
characteristic functions, involves quite tedious manipulations even for the
simplest case where $\theta$ is real and the sample is single-censored.
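
As a concrete reading of (1.1), the following Python sketch evaluates the censored log-likelihood for a one-parameter exponential model with mean $\theta$. The exponential model, the function names, and the toy failure times are assumptions made purely for illustration; none of them comes from the paper.

```python
import math

def censored_loglik(y, n, log_pdf, log_sf):
    """Log-likelihood (1.1) based on the r smallest order statistics.

    y       : sorted list holding Y_1 <= ... <= Y_r
    n       : total number of units on test
    log_pdf : callable returning log f(x, theta)
    log_sf  : callable returning log(1 - F(x, theta))
    """
    r = len(y)
    const = math.lgamma(n + 1) - math.lgamma(n - r + 1)  # log[n!/(n-r)!]
    return const + sum(log_pdf(v) for v in y) + (n - r) * log_sf(y[-1])

def exp_loglik(theta, y, n):
    """(1.1) for the exponential density f(x, theta) = exp(-x/theta)/theta."""
    return censored_loglik(
        y, n,
        log_pdf=lambda x: -math.log(theta) - x / theta,
        log_sf=lambda x: -x / theta,
    )

y_obs = [0.1, 0.3, 0.4]   # hypothetical first r = 3 failure times out of n = 5
n_units = 5
# For this particular model (1.1) is maximized at (total time on test) / r.
theta_hat = (sum(y_obs) + (n_units - len(y_obs)) * y_obs[-1]) / len(y_obs)
```

The last term of (1.1) is what distinguishes the censored case: the n - r unobserved units contribute only through the survival function evaluated at the largest observed failure time.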
A fair amount of work has grown on another front. In order to reduce the
computing job of iteratively solving the likelihood equation or to obtain
simple estimators that permit a grip on their small-sample properties, some
modified maximum likelihood (MML) estimators have been proposed. These are
generally targeted for location-scale models, and especially, for the normal
distribution. One interesting construct is due to Mehrotra and Nanda (1974)
who replace the 'hazard-rate' term $\partial \log \bar F(Y_r,\theta)/\partial\theta$,
that appears in the likelihood equation, by its expectation. While the
small-sample properties of these MML estimators have been studied for some
particular models, their asymptotic efficiencies have not been investigated.
Another type of MML estimators, due to Tiku (1967, 1976), derives from a linear
approximation of the hazard-rate term. Although extensive simulation studies of
these estimators have been reported, a careful treatment of the asymptotic
theory is lacking.

The  objectives  of  the  present  paper  are  twofold.  First,  we  streamline 
and  strengthen  the  traditional  treatment  of  the  asymptotics  of  ML  estimators 
with  type  II  censored  data.  At  the  same  time,  we  provide  a  general  setting  in 
which  the  asymptotics  of  ML,  the  aforementioned  MML's  or  other  perturbations 
of  the  estimating  equation  can  be  treated  in  a  unified  way.  Our  second  goal 
is  to  derive  the  asymptotic  efficiency  (AE)  of  the  Mehrotra  and  Nanda  MML 
estimator  and  study  the  effect  of  the  amount  of  censoring  on  the  AE. 

The  key  results  about  limiting  normality  and  uniform  strong  convergence 
are  developed  in  Section  2.  These  are  appropriately  specialized  in  Sections  3 
and  4  to  handle  the  ML  and  MML  estimators.  Our  treatment  of  asymptotic 
normality  is  based  on  a  result  of  Sethuraman  (1961)  concerning  the  conditional 
and  joint  limit  distribution  of  random  vectors.  It  yields  a  considerable 
simplification over Halperin's treatment, almost to the level of simplicity of
the iid case. Moreover, it makes the adaptation of the results to the
multiple-censoring case quite transparent. Incidentally, Fligner and Hettmansperger
(1979)  made  another  fruitful  use  of  this  approach  in  the  context  of  some  rank 
statistics. 

Results  for  the  Mehrotra  and  Nanda  MML  estimator  are  developed  in  Section 
4 in a general setting, and then specialized to the location and scale
parameters in Section 5. Numerical computations of the asymptotic variance and AE
are presented in Section 5 under the normal model for which these estimators
were found to have nice small-sample properties. These supplement the
small-sample variance and efficiency calculations of Mehrotra and Nanda (1974), and
show  the  effect  of  the  amount  of  censoring.  Section  6  briefly  indicates  how 
the  asymptotics  of  Tiku's  MML  estimator  follow  from  our  results. 


2.  THE  PRINCIPAL  TOOLS 

This section provides two results which are basic to the development of the
asymptotic theory of ML and MML estimators based on type II censored data.
First we introduce some notation and assumptions. The parameter $\theta$ is taken
to be a k-vector and the true value $\theta_0$ is assumed to be an interior point of
the parameter space $\Omega \subset R^k$. An important role will be played by random
vectors of the form

$$T_n(\theta) = n^{-1}\Big[\sum_{i=1}^{r} g(Y_i,\theta) + (n-r)h(Y_r,\theta)\Big] \tag{2.1}$$

where $g$ and $h$ are functions on $\mathcal{X}\times\Omega \to R^{\ell}$, and $\mathcal{X}$
denotes the sample space of $X_1$. For simplicity, $\theta_0$ will often be suppressed
in functional notation. For instance, $f(x)$ will stand for $f(x,\theta_0)$, $g(x)$
for $g(x,\theta_0)$, and

$$T_n \equiv T_n(\theta_0) = n^{-1}\Big[\sum_{i=1}^{r} g(Y_i) + (n-r)h(Y_r)\Big]. \tag{2.2}$$
Unless specified otherwise, all limits are taken as $n \to \infty$, and $r$ is taken
to be $[np]$, the integer part of $np$, where $p \in (0,1)$ is fixed. Throughout
it is assumed that $f(x)$ is continuous at its p-quantile $\xi$, and $f(\xi) > 0$.
The notation $\xrightarrow{\mathcal{D}} N_k(\mu,\Sigma)$ will be used for convergence
in distribution to a k-variate normal with mean $\mu$ and covariance matrix $\Sigma$.
Vectors will be written as column vectors and a transpose will be denoted by $*$.

Our first concern is with the asymptotic distribution of $T_n$ defined in
(2.2). Instead of working with sophisticated limit theorems for functions of
order statistics, we provide an elementary treatment by means of conditioning
on $Y_r$. A result of Sethuraman (1961), stated in Lemma 1, is instrumental
to our approach.

Lemma 1. Let $\{\zeta_n\}$ and $\{\eta_n\}$ be sequences of random $\ell$- and m-vectors
defined on a probability space. If (a) for arbitrary $t \in R^m$, the conditional
distribution of $\zeta_n$, given $\eta_n = t$, converges to $N_\ell(Bt,\Gamma)$, and
(b) $\eta_n$ strongly converges in distribution to $N_m(0,\Lambda)$, then

$$\zeta_n \xrightarrow{\mathcal{D}} N_\ell(0,\ \Gamma + B\Lambda B^{*}).$$

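Referring to (2.2), let $g_\alpha$ and $h_\alpha$ denote the $\alpha$th coordinate of $g$
and $h$, $\alpha = 1,\ldots,\ell$. Theorem 1 establishes the asymptotic normality of $T_n$.

Theorem 1. For $\alpha = 1,\ldots,\ell$, assume that (i) $h'_\alpha(x) = dh_\alpha(x)/dx$ exists
at $x = \xi$, (ii) $g_\alpha(x)$ is continuous at $\xi$, and
(iii) $\int_{-\infty}^{\xi} g_\alpha^2(x)f(x)\,dx < \infty$.
Then $n^{1/2}(T_n-\mu) \xrightarrow{\mathcal{D}} N_\ell(0,\Sigma)$ where, with $q = 1-p$,

$$\mu = \int_{-\infty}^{\xi} g(x)f(x)\,dx + q\,h(\xi)$$

$$\Sigma = \Gamma + pq\,f^{-2}(\xi)\,bb^{*}$$

$$\Gamma = \int_{-\infty}^{\xi} g(x)g^{*}(x)f(x)\,dx - p^{-1}\Big(\int_{-\infty}^{\xi} g(x)f(x)\,dx\Big)\Big(\int_{-\infty}^{\xi} g(x)f(x)\,dx\Big)^{*}$$

$$b = f(\xi)g(\xi) - p^{-1}f(\xi)\int_{-\infty}^{\xi} g(x)f(x)\,dx + q\,h'(\xi) \tag{2.3}$$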
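
The quantities in (2.3) are easy to evaluate numerically in the scalar case. The sketch below is an illustration only: the unit exponential density, the choice $g(x) = h(x) = x$ (for which $nT_n$ is the total time on test), the helper names, and the use of Simpson's rule are all assumptions, not part of the paper.

```python
import math

def simpson(func, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    step = (b - a) / m
    total = func(a) + func(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3.0

def theorem1_limits(g, h, h_prime, f, xi, p, lower):
    """Scalar mu and Sigma of (2.3); `lower` is the left end of the support."""
    q = 1.0 - p
    m1 = simpson(lambda x: g(x) * f(x), lower, xi)        # int g f dx
    m2 = simpson(lambda x: g(x) ** 2 * f(x), lower, xi)   # int g^2 f dx
    mu = m1 + q * h(xi)
    gamma = m2 - m1 ** 2 / p
    b = f(xi) * g(xi) - f(xi) * m1 / p + q * h_prime(xi)
    sigma = gamma + p * q * b ** 2 / f(xi) ** 2
    return mu, sigma

p = 0.3
xi = -math.log(1.0 - p)                 # p-quantile of the unit exponential
mu, sigma = theorem1_limits(
    g=lambda x: x, h=lambda x: x, h_prime=lambda x: 1.0,
    f=lambda x: math.exp(-x), xi=xi, p=p, lower=0.0,
)
```

For this choice both `mu` and `sigma` come out equal to p up to quadrature error. That agrees with the known fact that the total time on test from a unit exponential sample has a gamma distribution with shape r, so that $E\,T_n = r/n$ and $n\,\mathrm{Var}(T_n) = r/n$.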


Proof. Observe that a linear function $c^{*}T_n$ of the components of $T_n$
is again a function of the form (2.2) with $\ell = 1$, and $c^{*}g$ and $c^{*}h$ in
places of $g$ and $h$. Therefore, it suffices to prove the result for the
one-dimensional case ($\ell = 1$). We take $g$, $h$, and $T_n$ to be real-valued
for the rest of the proof.

Letting $\eta_n = n^{1/2}(Y_r-\xi)$, consider first the conditional limit
distribution of $n^{1/2}(T_n-\mu)$. Given $\eta_n = t$ or equivalently
$Y_r = \xi_n \equiv \xi + tn^{-1/2}$, the random variables $Y_1,\ldots,Y_{r-1}$ are
distributed as the order statistics of a random sample of size $r-1$ from the
truncated pdf $f(x)/F(\xi_n)$, $x < \xi_n$. Let $F(\xi_n) = p_n$ and denote the
truncated moments of $g(x)$ as

$$\nu_{1n} = p_n^{-1}\int_{-\infty}^{\xi_n} g f\,dx, \qquad \nu_{2n} = p_n^{-1}\int_{-\infty}^{\xi_n} g^2 f\,dx - \nu_{1n}^2 .$$

We have the decomposition $n^{1/2}(T_n-\mu) = A_{1n} + A_{2n}$ where

$$A_{1n} = n^{-1/2}\sum_{i=1}^{r-1}[g(Y_i) - \nu_{1n}],$$

$$A_{2n} = n^{1/2}\Big[\frac{r-1}{n}\,\nu_{1n} + \frac{n-r}{n}\,h(\xi_n) - \mu\Big] + n^{-1/2}g(\xi_n).$$

Conditionally, given $\eta_n = t$, $A_{1n}$ is a sum of iid components centered at the
mean. We are in fact dealing with a triangular array since the common
distribution depends on n. By virtue of the assumptions (ii) and (iii), the
Lindeberg-Feller central limit theorem applies, and we have that
$A_{1n} \xrightarrow{\mathcal{D}} N_1(0,\tau)$ where

$$\tau = p\,(\lim \nu_{2n}) = \int_{-\infty}^{\xi} g^2 f\,dx - p^{-1}\Big(\int_{-\infty}^{\xi} g f\,dx\Big)^2 .$$

The constant $A_{2n}$ can be written as

$$A_{2n} = n^{1/2}\Big[\frac{r-1}{n\,p_n}\int_{-\infty}^{\xi_n} g f\,dx - \int_{-\infty}^{\xi} g f\,dx\Big] + n^{1/2}\Big[\frac{n-r}{n}\,h(\xi_n) - q\,h(\xi)\Big] + n^{-1/2}g(\xi_n).$$

Using assumptions (i) and (ii) along with the relation $n^{1/2}(\xi_n-\xi) = t$, it is
straightforward to see that the three terms on the rhs converge to
$tf(\xi)[g(\xi) - p^{-1}\int_{-\infty}^{\xi} g f\,dx]$, $tq\,h'(\xi)$ and 0 respectively.
Consequently, the conditional limit distribution of $n^{1/2}(T_n-\mu)$ is
$N_1(bt,\tau)$ where

$$b = f(\xi)g(\xi) - p^{-1}f(\xi)\int_{-\infty}^{\xi} g f\,dx + q\,h'(\xi).$$

Finally, $\eta_n$ converges in density to $N_1(0,\ pq\,f^{-2}(\xi))$ which implies
strong convergence of $\eta_n$ in distribution in the sense of Sethuraman
(1961). An application of Lemma 1 yields that $n^{1/2}(T_n-\mu)$ is asymptotically
normal with mean 0 and variance $\tau + pq\,f^{-2}(\xi)\,b^2$. □

Remark. The above approach to proving Theorem 1 readily extends to the
case of multiple censoring. To illustrate, let us consider left censoring at
the observation $r_1 = [np_1]$ and right censoring at $r_2 = [np_2]$,
$0 < p_1 < p_2 < 1$. In this case, the relevant form of $T_n$ is

$$n^{-1}\Big[r_1 h_1(Y_{r_1}) + \sum_{i=r_1}^{r_2} g(Y_i) + (n-r_2)h_2(Y_{r_2})\Big].$$

With $\xi_\alpha$ denoting the $p_\alpha$-quantile of $f(x)$, $\alpha = 1,2$, the vector
$\eta_n = n^{1/2}(Y_{r_1}-\xi_1,\ Y_{r_2}-\xi_2)^{*}$ converges in density to a bivariate
normal $N_2(0,\Lambda)$, say. Conditionally, given $\eta_n = t$, the middle term of $T_n$
is distributed as a sum of iid components where the parent distribution has the
doubly truncated pdf $f(x)/[F(\xi_2) - F(\xi_1)]$, $\xi_1 < x < \xi_2$. Following the
steps of proof of Theorem 1, one arrives at the asymptotic normality of $T_n$
along with an explicit expression for the asymptotic covariance matrix. □


Our second result pertains to the convergence of $T_n(\theta)$, defined in
(2.1), with probability 1 (w.p.1) uniformly in a compact neighborhood of $\theta_0$.
Since the components of a vector $T_n(\theta)$ can be individually treated for this
purpose, it suffices to consider the case of real-valued functions $g$ and
$h$ in Theorem 2.

Theorem 2. Assume that for a compact neighborhood $B$ of $\theta_0$,

(i) $g(x,\theta)$ is continuous in $\theta \in B$ for every $x$,

(ii) for $\theta \in B$ and all $x$, $|g(x,\theta)| \le g_0(x)$ such that
$\int_{-\infty}^{\infty} g_0(x)f(x)\,dx < \infty$,

(iii) $h(x,\theta)$ is continuous on $[\xi-\epsilon_0,\ \xi+\epsilon_0]\times B$ for some $\epsilon_0 > 0$.

Then

$$\sup_{\theta\in B}|T_n(\theta) - \mu(\theta)| \to 0 \quad \text{w.p.1} \tag{2.4}$$

where

$$\mu(\theta) = \int_{-\infty}^{\xi} g(x,\theta)f(x)\,dx + q\,h(\xi,\theta). \tag{2.5}$$


Proof. Referring to (2.1), let $T_{1n}(\theta) = n^{-1}\sum_{i=1}^{r} g(Y_i,\theta)$ and
$T_{2n}(\theta) = n^{-1}(n-r)h(Y_r,\theta)$ so $T_n(\theta) = T_{1n}(\theta) + T_{2n}(\theta)$.
With $\chi_A(\cdot)$ denoting the indicator function of the set $A$, we have the
representation

$$T_{1n}(\theta) = n^{-1}\sum_{i=1}^{n} g(X_i,\theta)\,\chi_{A(Y_r)}(X_i)$$

where $A(Y_r) = (-\infty, Y_r]$. Consider its approximation by

$$T^{0}_{1n}(\theta) = n^{-1}\sum_{i=1}^{n} g(X_i,\theta)\,\chi_{A(\xi)}(X_i)$$

which is an average of iid components that are continuous in $\theta$ and bounded
by the integrable $g_0$. The uniform strong law (cf. Jennrich 1969) yields

$$\sup_{\theta\in B}\Big|T^{0}_{1n}(\theta) - \int_{-\infty}^{\xi} g(x,\theta)f(x)\,dx\Big| \to 0 \quad \text{w.p.1}.$$

Now, for an arbitrary $\epsilon > 0$, as $n \to \infty$, $Y_r$ lies outside the interval
$\xi \pm \epsilon$ at most finitely many times w.p.1. Consequently, for sufficiently
large n,

$$|T_{1n}(\theta) - T^{0}_{1n}(\theta)| \le n^{-1}\sum_{i=1}^{n}|g(X_i,\theta)|\,\chi_{[\xi-\epsilon,\,\xi+\epsilon]}(X_i) \le K\,n^{-1}\sum_{i=1}^{n}\chi_{[\xi-\epsilon,\,\xi+\epsilon]}(X_i)$$

where $K$ bounds $|g(x,\theta)|$ on $[\xi-\epsilon,\ \xi+\epsilon]\times B$. The last expression
converges to $K[F(\xi+\epsilon) - F(\xi-\epsilon)]$ w.p.1. By letting $\epsilon \to 0$, we therefore
obtain that $|T_{1n}(\theta) - T^{0}_{1n}(\theta)| \to 0$ w.p.1 uniformly in $B$. With regard to
$T_{2n}(\theta)$, note that $\sup_{\theta\in B}|h(x,\theta) - h(\xi,\theta)|$ is a continuous
function of $x \in [\xi-\epsilon_0,\ \xi+\epsilon_0]$. Since $Y_r \to \xi$ w.p.1, we have that
$|T_{2n}(\theta) - q\,h(\xi,\theta)| \to 0$ w.p.1 uniformly in $\theta \in B$. By combining the
two parts, the proof is complete. □
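
A small simulation can make the strong convergence visible. The sketch below is only an illustration under stated assumptions: it uses unit exponential failure times with $g(x) = h(x) = x$ and $\theta$ suppressed, for which the limit in (2.5) works out to $\mu = p$; the function names, the seed, and the sample sizes are arbitrary choices. It shows the pointwise case only, not uniformity in $\theta$.

```python
import random

def t_n(y, n, g, h):
    """T_n of (2.2): n^{-1} [ sum_{i<=r} g(Y_i) + (n - r) h(Y_r) ]."""
    r = len(y)
    return (sum(g(v) for v in y) + (n - r) * h(y[-1])) / n

def simulate_t_n(n, p, seed=0):
    """Draw n unit exponential lifetimes, keep the first r = [np], return T_n."""
    rng = random.Random(seed)
    x = sorted(rng.expovariate(1.0) for _ in range(n))
    r = int(n * p)
    return t_n(x[:r], n, g=lambda v: v, h=lambda v: v)

p = 0.3
values = {n: simulate_t_n(n, p) for n in (200, 2000, 20000)}
for n, value in values.items():
    print(n, round(value, 4))   # values should settle near p as n grows
```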


3.  APPLICATION  TO  MLE 

We  proceed  to  show  how  the  asymptotic  normality  and  consistency  results 
for  the  maximum  likelihood  estimator  (MLE)  in  the  type  II  censored  situation 
can  be  obtained  from  simple  adaptations  of  Theorems  1  and  2.  It  would  be 
enough  to  outline  the  main  steps  because  the  details  are  analogous  to  the 
treatment  of  MLE  in  the  iid  case.  Moreover,  instead  of  displaying  a  composite 
list  of  regularity  conditions,  it  would  be  more  instructive  to  state  them  as 
and  when  they  are  needed. 

Henceforth, we use an upper dot for the derivative of a function wrt $\theta$,
and two dots for the second derivative. Referring to (1.1) and denoting

$$\psi(x,\theta) = \log f(x,\theta), \qquad \rho(x,\theta) = \log \bar F(x,\theta), \tag{3.1}$$

the likelihood equation is given by

$$\dot l_n(\theta) = \sum_{i=1}^{r}\dot\psi(Y_i,\theta) + (n-r)\dot\rho(Y_r,\theta) = 0. \tag{3.2}$$

Since $\theta$ is a k-vector, so is $\dot l_n(\theta)$ while $\ddot l_n(\theta)$ will be a
$k \times k$ matrix. As before, the true $\theta_0$ will often be suppressed in notation,
for instance, $\dot\psi(x) = \dot\psi(x,\theta_0)$.
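
To see what iteratively solving (3.2) involves, here is a hedged Python sketch for the normal location model with unit variance, where $\dot\psi(x,\theta) = x-\theta$ and $\dot\rho(x,\theta)$ is the standard normal hazard rate at $x-\theta$. The Newton iteration, the starting value, the simulated data, and all function names are illustrative assumptions rather than a procedure taken from the paper.

```python
import math
import random

def hazard(x):
    """Standard normal hazard rate phi(x) / (1 - Phi(x))."""
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    sf = 0.5 * math.erfc(x / math.sqrt(2.0))
    return pdf / sf

def score(theta, y, n):
    """Left side of (3.2) for f(x, theta) = phi(x - theta)."""
    r = len(y)
    return sum(v - theta for v in y) + (n - r) * hazard(y[-1] - theta)

def solve_mle(y, n, tol=1e-10, max_iter=50):
    """Newton iteration on the score; y holds the r smallest order statistics."""
    r = len(y)
    theta = sum(y) / r                      # crude starting value
    for _ in range(max_iter):
        z = y[-1] - theta
        lam = hazard(z)
        # derivative of the score in theta: -r - (n - r) * lam'(z),
        # using lam'(z) = lam(z) * (lam(z) - z) for the normal hazard
        slope = -r - (n - r) * lam * (lam - z)
        step = score(theta, y, n) / slope
        theta -= step
        if abs(step) < tol:
            break
    return theta

rng = random.Random(1)
n_units, r_obs = 50, 30
sample = sorted(rng.gauss(2.0, 1.0) for _ in range(n_units))
y_obs = sample[:r_obs]                      # type II right censored sample
theta_hat = solve_mle(y_obs, n_units)
```

Because the hazard term is positive, the root lies above the mean of the observed order statistics, which is why that mean serves as a convenient starting point.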

First, we establish the asymptotic normality of $n^{1/2}(\hat\theta_n-\theta_0)$ assuming
that $\{\hat\theta_n\}$ is a consistent sequence of roots. A Taylor expansion of
$\dot l_n(\hat\theta_n) = 0$ around $\theta_0$ yields

$$n^{-1/2}\,\dot l_n(\theta_0) = \Gamma_n\,[n^{1/2}(\hat\theta_n-\theta_0)]$$

where $\Gamma_n$ is the random matrix $-n^{-1}\ddot l_n(\cdot)$ evaluated on the line segment
between $\theta_0$ and $\hat\theta_n$. Now, $n^{-1}\dot l_n(\theta_0)$ is a vector-valued function
of the form (2.2) with $\ell = k$, $g = \dot\psi$ and $h = \dot\rho$. Theorem 1, specialized
to these $g$ and $h$, readily yields the limiting normality. To obtain explicit
expressions for the mean and covariance, we assume that the derivative
$(\partial/\partial\theta)\int_{-\infty}^{x} f(y,\theta)\,dy$ can be carried within the integral.
This yields

$$h(x) = \dot\rho(x) = -[\bar F(x)]^{-1}\int_{-\infty}^{x} g(y)f(y)\,dy,$$

$$h'(x) = [\bar F(x)]^{-1} f(x)\,[-g(x) + h(x)]. \tag{3.4}$$

Using these results for $x = \xi$, the expressions (2.3) for the present case
reduce to

$$\mu = 0, \qquad b = -(pq)^{-1}f(\xi)\int_{-\infty}^{\xi}\dot\psi f\,dx,$$

$$\Sigma = J \equiv \int_{-\infty}^{\xi}\dot\psi\dot\psi^{*} f\,dx + q^{-1}\Big(\int_{-\infty}^{\xi}\dot\psi f\,dx\Big)\Big(\int_{-\infty}^{\xi}\dot\psi f\,dx\Big)^{*}. \tag{3.5}$$

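To establish that $n^{1/2}(\hat\theta_n-\theta_0) \xrightarrow{\mathcal{D}} N_k(0,J^{-1})$, it
remains to show that $\Gamma_n \to J$ w.p.1. To this end, we note that
$n^{-1}\ddot l_n(\theta)$ is a (matrix-valued) function of the form $T_n(\theta)$ given
in (2.1) with $g = \ddot\psi(x,\theta)$ and $h = \ddot\rho(x,\theta)$. Denoting

$$-J(\theta) = \int_{-\infty}^{\xi}\ddot\psi(x,\theta)f(x)\,dx + q\,\ddot\rho(\xi,\theta),$$

Theorem 2 entails that, uniformly in $\theta \in B$, $-n^{-1}\ddot l_n(\theta) \to J(\theta)$ w.p.1.
Assuming $J(\theta)$ continuous at $\theta_0$, it then follows that $\Gamma_n \to J(\theta_0)$ w.p.1.
In order to relate $J(\theta_0)$ to $J$ we differentiate
$\int_{-\infty}^{\xi} f(x,\theta)\,dx + \bar F(\xi,\theta) = 1$
twice wrt $\theta$ and obtain the identity

$$-\int_{-\infty}^{\xi}\ddot\psi(x,\theta)f(x,\theta)\,dx - \bar F(\xi,\theta)\,\ddot\rho(\xi,\theta)$$
$$\quad = \int_{-\infty}^{\xi}\dot\psi(x,\theta)\dot\psi^{*}(x,\theta)f(x,\theta)\,dx + [\bar F(\xi,\theta)]^{-1}\Big(\int_{-\infty}^{\xi}\dot\psi(x,\theta)f(x,\theta)\,dx\Big)\Big(\int_{-\infty}^{\xi}\dot\psi(x,\theta)f(x,\theta)\,dx\Big)^{*}.$$

For $\theta = \theta_0$, the lhs equals $J(\theta_0)$ while the rhs equals $J$. This concludes
the proof of the asymptotic normality of $n^{1/2}(\hat\theta_n-\theta_0)$.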

Turning now to the issue of existence of a strongly consistent sequence
of roots $\{\hat\theta_n\}$, we examine the limiting behavior of the log likelihood ratio
$n^{-1}[l_n(\theta) - l_n(\theta_0)]$. This is again of the form $T_n(\theta)$ in (2.1), now a
real-valued function, with the special $g$ and $h$ given by

$$g = \log[f(x,\theta)/f(x)], \qquad h = \log[\bar F(x,\theta)/\bar F(x)].$$

Theorem 2 entails that

$$n^{-1}[l_n(\theta) - l_n(\theta_0)] \to \mu(\theta) \equiv \int_{-\infty}^{\xi}\log[f(x,\theta)/f(x)]\,f(x)\,dx + q\log[\bar F(\xi,\theta)/q], \tag{3.6}$$

w.p.1 and uniformly in $\theta \in B$. Evidently $\mu(\theta_0) = 0$. To show that $\mu(\theta)$ has
a local maximum at $\theta_0$, we fix $\theta \in B$, define the function $u(x)$ as

$$u(x) = \begin{cases} f(x,\theta)/f(x), & x < \xi \\ \bar F(\xi,\theta)/q, & x \ge \xi \end{cases}$$

and let $Z$ be a random variable whose distribution has the pdf $f(x)$ on
$(-\infty,\xi)$ and a point mass $q$ at $x = \xi$. Then

$$\mu(\theta) = E[\log u(Z)] \le \log[E\,u(Z)] = 0$$

by Jensen's inequality and the fact that $E\,u(Z) = F(\xi,\theta) + \bar F(\xi,\theta) = 1$. The
inequality is strict for a $\theta \ne \theta_0$ once we impose the identifiability
condition:

$$P_{\theta_0}[\,f(X,\theta) \ne f(X,\theta_0),\ X < \xi\,] > 0.$$

In view of this and (3.6), the standard argument then leads to the existence
of a strongly consistent sequence of roots.

Remark. The collection of regularity conditions used in the course of our
proofs is essentially the same as given by Halperin (1952) with the exception
that the third derivative was not needed in our treatment. Also, the second
derivative was not used in the consistency proof. On the other hand, we have
assumed $\ddot\rho(x,\theta)$ to be continuous on $[\xi-\epsilon,\ \xi+\epsilon]\times B$. Also, we have
formalized the identifiability condition which was not explicitly addressed by
Halperin (1952).


4.  APPLICATION  TO  MMLE  OF  MEHROTRA  AND  NANDA 

With $\psi(x,\theta)$ and $\rho(x,\theta)$ defined in (3.1), let

$$c_n(\theta) \equiv E_\theta\,\dot\rho(Y_r,\theta) : \Omega \to R^k \tag{4.1}$$

and let $\tilde\theta_n$ be a solution of the estimating equation

$$\dot l_{1n}(\theta) \equiv \sum_{i=1}^{r}\dot\psi(Y_i,\theta) + (n-r)c_n(\theta) = 0. \tag{4.2}$$

Mehrotra and Nanda (1974) derived expressions for $\tilde\theta_n$ under the normal and
gamma models, and examined some exact properties including the bias and
variance. The object of this section is to derive the asymptotic properties
of this modified maximum likelihood estimator (MMLE) $\tilde\theta_n$ including an
expression of its limiting covariance matrix. Although the currently known
applications are confined to location and scale parameters, the asymptotics
can be treated for general parameters without added complication.

The asymptotic normality of $\tilde\theta_n$ follows along the lines of Section 3
with appropriate adaptations of Theorems 1 and 2. However, an additional
result in regard to the first-mean convergence of $n^{1/2}\dot\rho(Y_r,\theta)$ is needed.
This is stated in Lemma 2 and a proof is given in the Appendix. Also, in
addition to the assumptions of Section 3, some smoothness conditions will be
needed for the functions $c_n(\theta)$. Specifically, we assume that the $k \times k$
matrix of the partial derivatives $\dot c_n(\theta)$ converges to a limit $\nu(\theta)$
uniformly in $\theta \in B$, and that $\nu(\theta)$ is continuous at $\theta_0$.

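Lemma 2. If the function $\dot f(x) = \dot f(x,\theta_0) : \mathcal{X} \to R^k$ is bounded, then

$$\lim n^{1/2}[c_n(\theta_0) - \dot\rho(\xi,\theta_0)] = 0.$$

To arrive at the asymptotic distribution of $\tilde\theta_n$ we first introduce some
notation:

$$a = \dot\psi(\xi) - p^{-1}\int_{-\infty}^{\xi}\dot\psi f\,dx$$

$$\Gamma_1 = \int_{-\infty}^{\xi}\dot\psi\dot\psi^{*} f\,dx - p^{-1}\Big(\int_{-\infty}^{\xi}\dot\psi f\,dx\Big)\Big(\int_{-\infty}^{\xi}\dot\psi f\,dx\Big)^{*}$$

$$\Sigma_1 = \Gamma_1 + pq\,aa^{*}$$

$$J_1 = -\int_{-\infty}^{\xi}\ddot\psi f\,dx - q\nu, \qquad \nu = \nu(\theta_0)$$

$$\Lambda = J_1^{-1}\,\Sigma_1\,(J_1^{-1})^{*} \tag{4.3}$$

It is to be noted that the matrix $\nu$ and, a fortiori, $J_1$, are not
necessarily symmetric.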

Theorem 3. If $\{\tilde\theta_n\}$ is a consistent sequence of roots of the MML
equation (4.2), and the aforementioned conditions hold, then
$n^{1/2}(\tilde\theta_n-\theta_0) \xrightarrow{\mathcal{D}} N_k(0,\Lambda)$ where $\Lambda$ is given in (4.3).

Outline of proof. In line with the treatment of MLE in Section 3, we
begin with a Taylor expansion

$$n^{-1/2}\,\dot l_{1n}(\theta_0) = M_n\,[n^{1/2}(\tilde\theta_n-\theta_0)] \tag{4.4}$$

where the matrix $M_n$ corresponds to $-n^{-1}\partial\dot l_{1n}(\theta)/\partial\theta \equiv M_n(\theta)$.
To establish that $n^{-1/2}\dot l_{1n}(\theta_0) \xrightarrow{\mathcal{D}} N_k(0,\Sigma_1)$, we refer
to the rhs of (4.2), apply Theorem 1 with $g = \dot\psi$ and $h = 0$ in order to
handle the random term, and apply Lemma 2 to the non-random sequence
$c_n(\theta_0)$. That the limiting mean is 0 follows from the relation

$$\int_{-\infty}^{\xi}\dot\psi f\,dx + q\,\dot\rho(\xi) = 0.$$

Also, the above choices of $g$ and $h$ in Theorem 1 lead to the covariance
matrix $\Sigma_1$ defined in (4.3).

Next, consider $M_n(\theta)$ and employ Theorem 2 with $g = \ddot\psi$ and $h = 0$.
Under the stated assumption about the uniform limit of $\dot c_n(\theta)$, it then
follows that

$$M_n(\theta) \to J_1(\theta) \equiv -\int_{-\infty}^{\xi}\ddot\psi(x,\theta)f(x)\,dx - q\,\nu(\theta)$$

w.p.1 and uniformly in $\theta \in B$. Finally, note that $J_1(\theta)$ is continuous at
$\theta_0$, and $J_1(\theta_0) = J_1$ defined in (4.3). The proof is completed by combining
these results. □

As for the existence of a strongly consistent sequence of MMLE $\tilde\theta_n$, a
simple criterion can be provided when $\theta$ is real-valued. In this case, the
equality $\dot l_{1n}(\tilde\theta_n) = 0$ can be viewed as a necessary condition for the
maximization of a pseudo-likelihood function defined as

$$l_{1n}(\theta) = \sum_{i=1}^{r}\psi(Y_i,\theta) + (n-r)\,C_n(\theta)$$

where $C_n(\theta) = \int c_n(\theta)\,d\theta$. One can then employ Theorem 2 to deduce that

$$n^{-1}[l_{1n}(\theta) - l_{1n}(\theta_0)] \to \mu_1(\theta) \equiv \int_{-\infty}^{\xi}\log[f(x,\theta)/f(x)]\,f(x)\,dx + q\,[C(\theta) - C(\theta_0)] \tag{4.5}$$

w.p.1 and uniformly in $\theta \in B$. Then pursuing the same lines of reasoning as
for the MLE, we have

Theorem 4. Assume $\theta$ is real-valued and $\mu_1(\theta)$ is as defined in (4.5).
If $\dot\mu_1(\theta_0) = 0$ and $\mu_1(\theta)$ is strictly concave in a neighborhood of $\theta_0$,
then a strongly consistent sequence of MMLE exists.

A simple criterion such as Theorem 4 does not emerge in the case of a
vector parameter for the reason that the construction of a pseudo-likelihood
is not generally feasible. One would need to employ appropriate special
methods to handle the individual problems.


5.  ASYMPTOTIC  EFFICIENCY  RESULTS  FOR  LOCATION  AND  SCALE 
The function $c_n(\theta)$ takes a simple form when $\theta$ is either a location or
scale parameter. This is why the Mehrotra and Nanda MMLE has been found
convenient for these cases, especially under the normal distribution. In this
section we derive the asymptotic efficiency (AE) of $\tilde\theta_n$ relative to the MLE
$\hat\theta_n$ for the location and scale models that are 'regular' in the sense that the
conditions of the preceding sections hold. Numerical values of the AE are
then computed for the normal model, and the effect of the amount of censoring
is examined.

5. 1  Location 

Here $f(x,\theta) = f(x-\theta)$ and we take $\theta_0 = 0$ without loss of generality
because both $\tilde\theta_n$ and $\hat\theta_n$ are equivariant. Henceforth, we use a prime to
denote differentiation wrt $x$ while, as before, an upper dot denotes a
derivative wrt $\theta$. Defining $\psi(x) = \log f(x)$, and the hazard rate function
$\lambda(x) = f(x)/\bar F(x)$, we have

$$\dot\psi(x,\theta) = -\psi'(x-\theta), \qquad \dot\rho(x,\theta) = \lambda(x-\theta),$$

$$c_n(\theta) = E_0\,\lambda(Y_r) \equiv c_n. \tag{5.1}$$

Here $c_n(\theta)$ is free of $\theta$ so $\dot c_n(\theta) = 0$ and $C_n(\theta) = \theta c_n$. Also,
$\lim c_n = \lambda(\xi)$.
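
For the standard normal location model the constant $c_n$ in (5.1) can be obtained by one-dimensional quadrature against the density of the r-th order statistic, after which the estimating equation (4.2) has a closed-form root. The Python sketch below illustrates this under explicit assumptions: unit variance is treated as known, the integration range, grid size, and function names are arbitrary choices, and Simpson's rule stands in for whatever method Mehrotra and Nanda actually used.

```python
import math

def log_phi(x):
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def log_Phi(x):
    return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))

def log_sf(x):
    return math.log(0.5 * math.erfc(x / math.sqrt(2.0)))

def c_n(n, r, lo=-8.0, hi=8.0, m=4000):
    """E_0[lambda(Y_r)] for the r-th order statistic of n standard normals,
    approximated by Simpson's rule on [lo, hi]."""
    log_c = math.lgamma(n + 1) - math.lgamma(r) - math.lgamma(n - r + 1)

    def integrand(x):
        # hazard(x) times the density of Y_r, assembled on the log scale
        log_hazard = log_phi(x) - log_sf(x)
        log_dens = (log_c + (r - 1) * log_Phi(x)
                    + (n - r) * log_sf(x) + log_phi(x))
        return math.exp(log_hazard + log_dens)

    step = (hi - lo) / m
    total = integrand(lo) + integrand(hi)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * integrand(lo + i * step)
    return total * step / 3.0

def mmle_location(y, n):
    """Root of (4.2) when f(x, theta) = phi(x - theta):
    sum(Y_i - theta) + (n - r) c_n = 0."""
    r = len(y)
    return sum(y) / r + (n - r) * c_n(n, r) / r

hazard_at_median = math.exp(log_phi(0.0) - log_sf(0.0))   # lambda(xi) for p = 1/2
c_400_200 = c_n(400, 200)
```

With n = 400 and r = 200 the computed `c_400_200` is already close to `hazard_at_median`, in line with the statement $\lim c_n = \lambda(\xi)$.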

For consistency of $\tilde\theta_n$, we refer to (4.5) and obtain

$$\mu_1(\theta) = \int_{-\infty}^{\xi}[\psi(x-\theta) - \psi(x)]\,f(x)\,dx + q\lambda(\xi)\theta$$

whose derivatives at $\theta = 0$ are $\dot\mu_1(0) = 0$, and
$\ddot\mu_1(0) = \int_{-\infty}^{\xi}\psi''(x)f(x)\,dx$.
In order that $\ddot\mu_1(0)$ be negative, a simple sufficient condition is that
$\log f(x)$ be concave. Hence, the existence of a consistent MMLE $\tilde\theta_n$ is
ensured by the condition that the pdf $f(x)$ is strongly unimodal.

For the asymptotic normality, we further assume that $f'(x)$ is bounded
so Lemma 2 and Theorem 3 hold. Noting that $\int_{-\infty}^{\xi}\psi'(x)f(x)\,dx = f(\xi)$, and
using (4.3) we obtain

$$a = -\psi'(\xi) + p^{-1}f(\xi)$$

$$\Gamma_1 = \int_{-\infty}^{\xi}[\psi'(x)]^2 f(x)\,dx - p^{-1}f^2(\xi) \tag{5.2}$$

$$J_1 = -\int_{-\infty}^{\xi}\psi''(x)f(x)\,dx .$$

Let $AV(\tilde\theta_n)$ denote the asymptotic variance of $n^{1/2}(\tilde\theta_n-\theta_0)$. Using the
expression for $\Lambda$ given in (4.3), we have

$$AV(\tilde\theta_n) = \frac{\int_{-\infty}^{\xi}[\psi'(x)]^2 f(x)\,dx - p^{-1}f^2(\xi) + pq\,[p^{-1}f(\xi) - \psi'(\xi)]^2}{\big[\int_{-\infty}^{\xi}\psi''(x)f(x)\,dx\big]^2}. \tag{5.3}$$

An alternative representation of this, in terms of truncated moments, will be
convenient for calculation. With $X_\xi$ denoting a random variable which has
the (truncated) pdf $p^{-1}f(x)$, $x < \xi$, (5.3) reduces to

$$AV(\tilde\theta_n) = \frac{\mathrm{Var}\,\psi'(X_\xi) + q\,[E\psi'(X_\xi) - \psi'(\xi)]^2}{p\,[E\psi''(X_\xi)]^2}. \tag{5.4}$$

The asymptotic variance of $n^{1/2}(\hat\theta_n-\theta_0)$, denoted by $AV(\hat\theta_n)$, is similarly
obtained from the general expression in Section 3:

$$AV(\hat\theta_n) = J^{-1} = p^{-1}\big[\mathrm{Var}\,\psi'(X_\xi) + q^{-1}E^2\psi'(X_\xi)\big]^{-1}. \tag{5.5}$$

The AE of $\tilde\theta_n$ is then given by $AV(\hat\theta_n)/AV(\tilde\theta_n)$. These are computed for the
normal distribution in Example 1.

Example 1. Consider $f(x) = \phi(x)$, the standard normal pdf, and let
$\Phi(x)$ denote its df. Here $\psi'(x) = -x$, $\psi''(x) = -1$, and the asymptotic
variances involve the truncated moments

$$\alpha_j = p^{-1}\int_{-\infty}^{\xi} x^j\phi(x)\,dx, \qquad j = 1,2,\ldots, \tag{5.6}$$

with $\mu_\xi = \alpha_1$ and $\sigma_\xi^2 = \alpha_2 - \alpha_1^2$ denoting the mean and variance of $X_\xi$.
Specifically, the AE $e_1(\xi)$ of the Mehrotra and Nanda MMLE $\tilde\theta_n$ in this case
is given by

$$e_1(\xi) = \big[(\sigma_\xi^2 + q^{-1}\mu_\xi^2)(\sigma_\xi^2 + q(\xi-\mu_\xi)^2)\big]^{-1}. \tag{5.7}$$

Numerical computation is simplified by the fact that both $\mu_\xi$ and $\sigma_\xi^2$ can be
expressed in terms of $\phi(\xi)$ and $\Phi(\xi)$ by integration by parts. In
particular,

$$\mu_\xi = -\phi(\xi)/\Phi(\xi), \qquad \alpha_2 = 1 - \xi\phi(\xi)/\Phi(\xi),$$

$$\sigma_\xi^2 = 1 - \xi\phi(\xi)/\Phi(\xi) - [\phi(\xi)/\Phi(\xi)]^2 . \tag{5.8}$$

Table 1 presents the values of $e_1(\xi)$, $AV(\hat\theta_n)$ and $AV(\tilde\theta_n)$ for different
$\xi$'s and corresponding p's. Note that smaller values of p correspond to
the increased severity of censoring. Although, as $p \to 0$, the limiting value
of $e_1(\xi)$ is zero, Table 1 shows an extremely slow approach to this limit. A
very high efficiency is retained even for 50% censoring.
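
The formulas (5.4), (5.5), (5.7) and (5.8) translate directly into a few lines of code. The sketch below is an illustrative Python transcription, not the program used for Table 1; the function names and the grid of $\xi$ values are assumptions, and no attempt is made to reproduce the table entries.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal df."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def ae_location(xi):
    """Return (e_1, AV of MLE, AV of MMLE) for censoring at the quantile xi."""
    p = Phi(xi)
    q = 1.0 - p
    m = phi(xi) / p
    mu = -m                                      # mean of X_xi, from (5.8)
    sigma2 = 1.0 - xi * m - m * m                # variance of X_xi, from (5.8)
    av_mle = 1.0 / (p * (sigma2 + mu * mu / q))  # (5.5) with psi'(x) = -x
    av_mmle = (sigma2 + q * (xi - mu) ** 2) / p  # (5.4) with psi''(x) = -1
    return av_mle / av_mmle, av_mle, av_mmle

for xi in (-1.0, 0.0, 1.0, 2.0):
    e1, av_hat, av_tilde = ae_location(xi)
    print(f"xi={xi:+.1f}  p={Phi(xi):.4f}  AV(MLE)={av_hat:.4f}  "
          f"AV(MMLE)={av_tilde:.4f}  e1={e1:.4f}")
```

The ratio returned as the first component equals the right side of (5.7), since the factors of p cancel between (5.4) and (5.5).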

5.2  Scale 

We consider $f(x,\theta) = \theta^{-1}f(x/\theta)$ and take $\theta_0 = 1$ without loss of
generality. With $\psi(x)$ and $\lambda(x)$ defined as before, we have

$$\psi(x,\theta) = -\log\theta + \psi(x/\theta),$$

$$\dot\psi(x,\theta) = -\theta^{-1} - \theta^{-2}x\,\psi'(x/\theta),$$

$$\dot\rho(x,\theta) = \theta^{-2}x\,\lambda(x/\theta). \tag{5.9}$$

Also, $c_n(\theta) = \theta^{-1}k_n$ where $k_n \equiv E_1[Y_r\lambda(Y_r)]$ is free of $\theta$, and
$\lim k_n = \xi\lambda(\xi)$. Consequently, $\dot c_n(\theta) = -\theta^{-2}k_n$,
$C_n(\theta) = k_n\log\theta$ and $\nu(\theta_0) = -\xi\lambda(\xi)$.
In this case, expression (4.5) becomes

$$\mu_1(\theta) = -p\log\theta + \int_{-\infty}^{\xi}[\psi(x/\theta) - \psi(x)]\,f(x)\,dx + q\,\xi\lambda(\xi)\log\theta$$

whose derivatives at 1 are $\dot\mu_1(1) = 0$, and

$$\ddot\mu_1(1) = \int_{-\infty}^{\xi}[x\psi'(x) + x^2\psi''(x)]\,f(x)\,dx .$$

Therefore, a simple sufficient condition for the strong consistency of the
MMLE $\tilde\theta_n$ is that $[x\psi'(x) + x^2\psi''(x)] < 0$ for all $x$. For the standard
normal distribution, this function reduces to $-2x^2$ so the condition holds.
Incidentally, it can be seen that if $X$ is a positive random variable, this
condition is equivalent to strong unimodality of the pdf of $\log X$.

The  asymptotic  variances  AV(9n)  *"<*  AV(?n)  again  follow  from  the 
general  results  obtained  in  Sections  3  and  4  as  we  specialize  to  the  functions 
given  in  (5.9).  The  algebra  is  straightforward  although  somewhat  more  tedious 
than  the  location  case.  Here,  use  of  Lemma  2  requires  that  f(x)  and 
xf'(x)  be  bounded.  Some  relevant  expressions  are 

o  -  ”C9'(C)  +  p"1  xV(x)f(x)dx  , 

T1  *  JL  t1  ♦  x*'(x)]2f(x)dx  -  p”1  (1  +  x*'(x))f(x)dx]2  , 

(5.10) 

Jt  -  -J^  [1  +  2x**(x)  +  x2*"(x)]f(x)dx  +  qCX(C)  , 

J  -  (1  +  x*’(x)]2f(x)dx  +  q"1;2f2(C)  . 


From these basic quantities, the asymptotic variances

$$
\begin{aligned}
AV(\tilde\theta_n) &= (\tau_1 + pq\sigma^2)/J_1^2 , \\
AV(\hat\theta_n) &= 1/J
\end{aligned}
\tag{5.11}
$$

are readily obtained.


Example 2. For the normal distribution, where $f(x,\theta)$ is the $N(0,\theta^2)$ density, we have $\psi'(x) = -x$, $\psi''(x) = -1$, and the expressions (5.10) reduce to expressions in the truncated moments $a_j$ defined in (5.6). Denoting $v_4 = a_4 - a_2^2$, we obtain

$$
\sigma = \xi^2 - a_2 , \qquad \tau_1 = p\,v_4 , \qquad
J_1 = 2p\,a_2 , \qquad J = p\,[v_4 + q^{-1}(a_2 - 1)^2] .
$$
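The reduction can be verified term by term. The sketch below assumes that the truncated moments of (5.6), which lies outside this section, have the form $a_j = p^{-1}\int_{-\infty}^{\xi} x^j f(x)\,dx$ with $f$ the standard normal pdf.

```latex
% Assumption: a_j = p^{-1} \int_{-\infty}^{\xi} x^j f(x) dx, f the standard normal pdf.
\begin{align*}
\xi f(\xi) &= p(1-a_2)
   && \text{(integrate } \textstyle\int_{-\infty}^{\xi} x^{2} f(x)\,dx \text{ by parts, using } f'(x) = -x f(x)),\\
\sigma &= \xi^{2} - p^{-1}\textstyle\int_{-\infty}^{\xi} x^{2} f(x)\,dx = \xi^{2} - a_2 ,\\
\tau_1 &= p(1 - 2a_2 + a_4) - p(1-a_2)^{2} = p(a_4 - a_2^{2}) = p\,v_4 ,\\
J_1 &= -p(1 - 3a_2) + \xi f(\xi) = 2p\,a_2
   && \text{(since } q\,\xi\lambda(\xi) = \xi f(\xi)),\\
J &= p\,v_4 + p(1-a_2)^{2} + q^{-1}p^{2}(1-a_2)^{2}
   = p\,[v_4 + q^{-1}(a_2-1)^{2}]
   && \text{(since } p + q = 1).
\end{align*}
```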

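Consequently,

$$
\begin{aligned}
AV(\tilde\theta_n) &= p^{-1}[v_4 + q(\xi^2 - a_2)^2](2a_2)^{-2} , \\
AV(\hat\theta_n) &= p^{-1}[v_4 + q^{-1}(a_2 - 1)^2]^{-1} , \\
e_2(\xi) &\equiv AE(\tilde\theta_n) = AV(\hat\theta_n)/AV(\tilde\theta_n) .
\end{aligned}
\tag{5.12}
$$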

Numerical computations are presented in Table 2. As with the MMLE of a normal mean studied in Example 1, the MMLE of the standard deviation does not incur much loss of AE when $p$ is large, that is, when censoring is light. Also, the AE tends to decrease in the extremely low range of $p$. However, it is curious that, unlike the monotone behavior found in Example 1, here $AV(\tilde\theta_n)$ and $e_2(\xi)$ have humps over an intermediate range of $p$.
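The numerical computations can be reproduced from (5.12) with a few lines of code. The following is a minimal sketch, not the author's program: the function name `scale_mmle_efficiency` is hypothetical, and it assumes that the truncated moments of (5.6) are $a_j = p^{-1}\int_{-\infty}^{\xi} x^j f(x)\,dx$ with $f$ the standard normal pdf.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def scale_mmle_efficiency(p):
    """Return (AV of MLE, AV of MMLE, e2) from (5.12) for 0 < p < 1.

    Hypothetical helper; assumes a_j = p^{-1} * int_{-inf}^{xi} x^j f(x) dx.
    """
    q = 1.0 - p
    xi = NormalDist().inv_cdf(p)                 # p-th quantile of N(0, 1)
    f_xi = exp(-0.5 * xi * xi) / sqrt(2.0 * pi)  # standard normal pdf at xi

    # Unnormalized truncated moments, obtained by integration by parts.
    m2 = p - xi * f_xi                # int x^2 f(x) dx over (-inf, xi]
    m4 = 3.0 * m2 - xi ** 3 * f_xi    # int x^4 f(x) dx over (-inf, xi]
    a2, a4 = m2 / p, m4 / p
    v4 = a4 - a2 ** 2

    av_mmle = (v4 + q * (xi ** 2 - a2) ** 2) / (p * (2.0 * a2) ** 2)
    av_mle = 1.0 / (p * (v4 + (a2 - 1.0) ** 2 / q))
    return av_mle, av_mmle, av_mle / av_mmle


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        av_mle, av_mmle, e2 = scale_mmle_efficiency(p)
        print(f"p={p:.2f}  AV_mle={av_mle:.4f}  AV_mmle={av_mmle:.4f}  e2={e2:.4f}")
```

At $p = 0.5$ (so $\xi = 0$) the function gives variances 1 and 1.25 and efficiency 0.8, in agreement with the triple 1.0000, 1.2500, .8000 listed in Table 2.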


6.  APPLICATION  TO  THE  MMLE  OF  TIKU 

In  the  context  of  estimating  the  mean  and  standard  deviation  of  a  normal 
distribution  from  a  type  II  censored  sample,  Tiku  (1967)  proposed  a  linear 
approximation  of  the  hazard  rate  term  that  appears  in  the  likelihood  equation. 
Asymptotic  normality  and  efficiency  of  this  type  of  MMLE  can  also  be  obtained 
from  the  basic  tools  developed  in  Section  2.  It  turns  out  that  not  only  for 
the  normal  model  but  for  a  general  location-scale  model  as  well,  this  type  of 
MMLE  is  asymptotically  fully  efficient.  To  indicate  why  this  is  so,  it  would 
suffice to consider the location model $f(x,\theta) = f(x-\theta)$, in which case the likelihood equation is

$$\dot\ell_n(\theta) = -\sum_{i=1}^{r} \psi'(Y_i - \theta) + (n-r)\,\lambda(Y_r - \theta) = 0 \tag{6.1}$$

where $\lambda(x) = f(x)/\bar F(x)$. Tiku's MMLE results from replacing $\lambda(Y_r - \theta)$ in (6.1) by its linear approximation $h_1(Y_r - \theta) = a(Y_r - \theta) + b$ with $a = \lambda'(\xi)$ and $b = \lambda(\xi) - \xi\lambda'(\xi)$. Referring to the development in Section 3, in particular, expressions (3.2) and (3.4), note that the function $h(x) = \lambda(x)$ is now changed to $h_1(x) = ax + b$. However, with $a$ and $b$ specified above, we have $h(\xi) = h_1(\xi)$ and $h'(\xi) = h_1'(\xi)$ so the results in (3.5) do not change. Likewise, in place of $\dot\rho(\xi,\theta) = -h'(\xi - \theta)$ in (3.6) we now have $-h_1'(\xi - \theta) = -h'(\xi)$. However, their difference $\to 0$ as $\theta \to 0$ so we have the same $J(\theta_0)$, and hence the same asymptotic variance as for the MLE. This clarifies Tiku's (1978) heuristic reasoning that the modified likelihood equation is "asymptotically equivalent" to the original likelihood equation.
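The tangency that drives this argument is easy to see numerically. The sketch below is illustrative only: the names `hazard`, `tiku_coefficients` and `h1` are hypothetical, the model is taken to be standard normal, and the derivative uses the general identity $\lambda' = \lambda(\psi' + \lambda)$ with $\psi'(x) = -x$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def hazard(x):
    """Hazard rate lambda(x) = f(x) / (1 - F(x)) of the standard normal."""
    return STD_NORMAL.pdf(x) / (1.0 - STD_NORMAL.cdf(x))


def tiku_coefficients(p):
    """Return (xi, a, b) with a = lambda'(xi) and b = lambda(xi) - xi * lambda'(xi)."""
    xi = STD_NORMAL.inv_cdf(p)   # p-th quantile
    lam = hazard(xi)
    a = lam * (lam - xi)         # lambda' = lambda * (psi' + lambda), psi'(x) = -x
    b = lam - xi * a
    return xi, a, b


def h1(x, a, b):
    """Linear approximation h1(x) = a * x + b to the hazard rate."""
    return a * x + b


if __name__ == "__main__":
    xi, a, b = tiku_coefficients(0.7)
    for x in (xi - 0.5, xi, xi + 0.5):
        print(f"x={x:+.4f}  hazard={hazard(x):.6f}  h1={h1(x, a, b):.6f}")
```

The line `h1` agrees with the hazard rate in value and slope at $\xi$ and departs from it away from $\xi$, which is why only the behavior at $\xi$ enters the limiting quantities.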


APPENDIX 


Proof of Lemma 2. For simplicity we suppress $\theta_0$ in notation, and first consider the case where $\theta$ is real-valued. Let $f_{r,n}(\cdot)$ denote the pdf of $Y_{r,n}$, the $r$-th order statistic of a random sample $Z_1,\ldots,Z_n$ from $F(x)$. Since $\rho(x) = -\dot F(x)/\bar F(x)$, we have

$$
\begin{aligned}
-c_n &= \int_{-\infty}^{\infty} \dot F(x)[\bar F(x)]^{-1} f_{r,n}(x)\,dx \\
     &= [n/(n-r)] \int_{-\infty}^{\infty} \dot F(x) f_{r,n-1}(x)\,dx \\
     &= [n/(n-r)]\, E\,a(Y_{r,n-1})
\end{aligned}
$$

where $a(x) = \dot F(x)$. Since $\rho(\xi) = -q^{-1}a(\xi)$ and $(n-r)/n \to q$, it would suffice to establish that

$$\lim\, n^{1/2}\,[a(\xi) - E\,a(Y_{r,n})] = 0 . \tag{A.1}$$
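The step from $f_{r,n}$ to $f_{r,n-1}$ in the display for $-c_n$ rests on a standard identity for order-statistic densities, sketched here for $r < n$:

```latex
\begin{align*}
f_{r,n}(x) &= \frac{n!}{(r-1)!\,(n-r)!}\,F(x)^{r-1}\,\bar F(x)^{\,n-r} f(x),\\
\frac{f_{r,n}(x)}{\bar F(x)}
  &= \frac{n}{n-r}\cdot\frac{(n-1)!}{(r-1)!\,(n-1-r)!}\,F(x)^{r-1}\,\bar F(x)^{\,n-1-r} f(x)
   = \frac{n}{n-r}\, f_{r,n-1}(x).
\end{align*}
```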


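Let $S_n(\cdot)$ denote the empirical cdf of $Z_1,\ldots,Z_n$. Using a representation due to Bahadur (1966) we write

$$Y_{r,n} = \xi + V_n + R_n$$

where $V_n = [p - S_n(\xi)]/f(\xi)$ and, as $n \to \infty$, the remainder $R_n$ is of order $n^{-3/4}$, up to logarithmic factors, w.p.1.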


Then

$$n^{1/2}\,[a(Y_{r,n}) - a(\xi) - V_n a'(\xi)] = W_{1n} + W_{2n} \tag{A.2}$$

where

$$W_{1n} = n^{1/2} V_n [a'(\xi_n) - a'(\xi)] , \qquad W_{2n} = n^{1/2} a'(\xi_n) R_n , \tag{A.3}$$

and $\xi_n$ lies between $\xi$ and $Y_{r,n}$. Since $EV_n^2 = pq/[nf^2(\xi)]$, $\xi_n \to \xi$ w.p.1, and $a'(x) = \dot f(x)$ is assumed bounded, we have

$$E^2|W_{1n}| \le f^{-2}(\xi)\,pq\,E[a'(\xi_n) - a'(\xi)]^2 \to 0 .$$




Duttweiler (1973) establishes the order of the mean-square of Bahadur's approximation. His result entails that

$$ER_n^2 = f^{-2}(\xi)(2pq/\pi)^{1/2} n^{-3/2}[1 + o(1)]$$

and, consequently, $EW_{2n}^2 \to 0$. Thus, we have established that $E|W_{1n}| \to 0$ and $E|W_{2n}| \to 0$, which imply that the lhs of (A.2) converges to 0 in the first mean. Since $EV_n = 0$ for all $n$, the result (A.1) follows.

If $\theta$ is vector-valued so are $c_n$ and $\rho(\xi)$, and the above argument applies to each coordinate of $n^{1/2}[c_n - \rho(\xi)]$. □
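The two moment bounds used in the proof can be spelled out as follows. This is a sketch in which $M$, a bound on $|a'(x)|$, is a symbol introduced here; the first line is the Cauchy-Schwarz inequality and the second uses Duttweiler's expression for $ER_n^2$.

```latex
% M := sup_x |a'(x)| < infinity (notation assumed here).
\begin{align*}
\bigl(E|W_{1n}|\bigr)^{2} &\le n\,E V_n^{2}\; E[a'(\xi_n) - a'(\xi)]^{2}
   = f^{-2}(\xi)\,pq\; E[a'(\xi_n) - a'(\xi)]^{2},\\
E W_{2n}^{2} &\le n M^{2}\, E R_n^{2}
   = M^{2} f^{-2}(\xi)\,(2pq/\pi)^{1/2}\, n^{-1/2}\,[1 + o(1)] \to 0,\\
E|W_{2n}| &\le \bigl(E W_{2n}^{2}\bigr)^{1/2} \to 0 .
\end{align*}
```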


Table 1. Asymptotic Efficiency of the Mehrotra and Nanda MMLE for a Normal Mean $\theta$

Tabulated quantities: $AV(\hat\theta_n)$, $AV(\tilde\theta_n)$, $e_1(\xi)$


Table 2. Asymptotic Efficiency of the Mehrotra and Nanda MMLE for a Normal Standard Deviation $\theta$

  $\xi$       $AV(\hat\theta_n)$   $AV(\tilde\theta_n)$   $e_2(\xi)$
 -2.3263      2.3729               3.1241                 .7595
 -1.6449      1.3177               1.4939                 .8820
 -1.2816      1.1758               1.2333                 .9533
              1.1469               1.1469                1.0000
              1.1364               1.1763                 .9661
              1.0868               1.2250                 .8872
              1.0000               1.2500                 .8000
               .8930               1.2128                 .7363
               .7823               1.0872                 .7195
  1.2816
  1.6449
  2.3263